Geographic Topic Model: Appendix

Author

  • Jacob Eisenstein
Abstract

Faceted topic models combine topical content with extraneous facets, such as ideology or dialect. In this model, the “pure” topics are corrupted by the facets, using a hierarchical generative model in which the pure topics act as priors on the faceted topics. This is most easily modeled using the logistic-normal distribution, which admits a normal prior on the mean.

1 Model

We build on latent Dirichlet allocation:

  • For each document $d$, draw $\theta_d \sim \mathrm{Dirichlet}(\alpha_\theta)$
  • For each token $n < N_d$
      – Draw a topic index $z_n \sim \theta_d$
      – Draw a word token from the associated topic, $w_n \sim \beta_{z_n}$

We augment each document with an additional discrete facet variable, $v_d \sim \vartheta$ with $\vartheta \sim \mathrm{Dir}(\alpha_\vartheta)$, which selects the appropriate faceted topic: $w_n \sim \beta^{(v_d)}_{z_n}$. Here $z_n$ indexes the topic and $v_d$ indicates the facet of the topic: for example, $v_d$ may select the “Detroit” version of the “electronic music” topic. We may also introduce metadata $y_d$, such that $y_d \sim f(y; \rho_{v_d})$, where $f$ is an arbitrary probability distribution with parameters $\rho_{v_d}$. In the geographical topic model, $f(y; \rho_{v_d})$ takes the form of a bivariate Gaussian; variational inference in this setting is deferred to [4].

The faceted topics take logistic-normal priors, such that $\beta^{(j)}_k \sim \mathrm{LN}(\mu_k, \sigma_k)$. It will be convenient to introduce the auxiliary variable $\eta$, such that $\beta = \exp(\eta) / \sum_i \exp(\eta[i])$ and $\eta^{(j)}_k \sim \mathcal{N}(\mu_k, \sigma_k)$. Throughout, $i$ will index the word, $j$ will index the facet, and $k$ will index the topic. We draw the pure topics from normal priors, $\mu_k \sim \mathcal{N}(a, b)$, and the topic variances from Gamma priors, $\sigma_k[i] \sim \mathcal{G}(c, d)$.

2 Variational Approximation

We make a fully factorized approximation: $Q(z, v, \theta, \vartheta, \eta, \mu, \sigma) = \prod^{K} \ldots$
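
The generative process described in the Model section can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `sample_corpus`, the corpus sizes, and the default hyperparameter values are all hypothetical. The sketch also assumes that b and the entries of σ are variances and that the Gamma prior G(c, d) is parameterized by shape c and rate d, none of which the abstract specifies. The metadata variable y_d and its bivariate Gaussian are omitted.

```python
import numpy as np


def softmax(eta):
    """Map log-weights eta to a distribution: exp(eta) / sum_i exp(eta[i])."""
    e = np.exp(eta - eta.max(axis=-1, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=-1, keepdims=True)


def sample_corpus(D=5, K=3, J=2, V=20, N_d=8,
                  alpha_theta=1.0, alpha_vartheta=1.0,
                  a=0.0, b=1.0, c=2.0, d=2.0, seed=0):
    """Draw a toy corpus: D documents, K topics, J facets, V word types.

    All sizes and hyperparameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Pure topics mu_k ~ N(a, b); b is treated as a variance (assumption).
    mu = rng.normal(a, np.sqrt(b), size=(K, V))
    # Per-word topic variances sigma_k[i] ~ Gamma(shape=c, rate=d) (assumption).
    sigma = rng.gamma(shape=c, scale=1.0 / d, size=(K, V))

    # Faceted topics: eta_k^(j) ~ N(mu_k, sigma_k), then beta = softmax(eta).
    eta = rng.normal(mu[None], np.sqrt(sigma)[None], size=(J, K, V))
    beta = softmax(eta)  # shape (J, K, V); each beta[j, k] sums to 1

    # Corpus-level distribution over facets.
    vartheta = rng.dirichlet(np.full(J, alpha_vartheta))

    docs, facets = [], []
    for _ in range(D):
        theta = rng.dirichlet(np.full(K, alpha_theta))  # topic proportions
        v = rng.choice(J, p=vartheta)                   # facet for this document
        z = rng.choice(K, size=N_d, p=theta)            # topic index per token
        # Each word is drawn from the facet-specific version of its topic.
        w = np.array([rng.choice(V, p=beta[v, k]) for k in z])
        docs.append(w)
        facets.append(int(v))
    return beta, docs, facets


beta, docs, facets = sample_corpus()
```

Because every facet's version of topic k is drawn around the same mean μ_k, the J variants of a topic stay close to the shared “pure” topic while still differing word by word; the size of σ_k[i] controls how far a facet can pull word i away from it.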

Similar resources

Portraits of Livia in context: an analysis of distribution through the application of geographic information systems

APPENDIX D TABLES 94 APPENDIX E FIGURES 100 BIBLIOGRAPHY 137 iii LIST OF TABLES Table D1. Frequency of portrait types by geographic region. 94 Table D2. Frequency of portraits of Livia in context by portrait type. 95 Table D3. Frequency of the portrait types of Livia by date. 96 Table D4. Type of city with portraits of Livia by region of the Roman Empire. 97 Table D5. Architectural context of t...

Technical Details of a Domain-independent Framework for Modeling Emotion

This technical report elaborates on the technical details of the EMA model of emotional appraisal and coping. It should be seen as an appendix to the journal article on this topic (Gratch & Marsella, to appear).

Probabilistic Topic Modeling in Multilingual Settings: A Short Overview of Its Methodology and Applications

Probabilistic topic models are unsupervised generative models that model document content as a two-step generation process, i.e., documents are observed as mixtures of latent topics, while topics are probability distributions over vocabulary words. Recently, a significant research effort has been invested into transferring the probabilistic topic modeling concept from monolingual to multilingua...

Why Geographic Factors are Necessary in Development Studies

This paper proposes that the resurgence of geographic factors in the study of uneven development is not due simply to the recurrent nature of intellectual fashions, nor necessarily because arguments that rely on geographic factors are less simplistic than before, nor because they avoid racialist, imperialistic, and deterministic forms they sometimes took in the past. Rather, this paper argues t...

A Latent Variable Model for Geographic Lexical Variation

The rapid growth of geotagged social media raises new computational possibilities for investigating geographic linguistic variation. In this paper, we present a multi-level generative model that reasons jointly about latent topics and geographical regions. High-level topics such as “sports” or “entertainment” are rendered differently in each geographic region, revealing topic-specific regional ...

Publication date: 2011